Modeling Regular Replacement for String Constraint Solving
Abstract
Bugs in the user input sanitation of software systems often lead to vulnerabilities, and many of them are caused by improper use of regular replacement. This paper presents a precise modeling of various semantics of regular substitution, such as the declarative, finite, greedy, and reluctant semantics, using finite state transducers (FST). By projecting an FST to its input and output tapes, we are able to solve atomic string constraints, which can be applied to both forward and backward image computation in model checking and symbolic execution of text processing programs. We report several interesting discoveries, e.g., certain fragments of the general problem can be handled using the less expressive deterministic FST. A compact representation of FST is implemented in SUSHI, a string constraint solver, which is applied to detecting vulnerabilities in web applications.
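The difference between greedy and reluctant replacement, and the kind of sanitation bug the abstract alludes to, can be illustrated with a minimal sketch. It uses Python's standard `re` module rather than the FST construction or the SUSHI solver described in the paper; the function names and sample inputs are hypothetical and chosen only for illustration.

```python
import re


def strip_tags_greedy(text: str) -> str:
    # Greedy: ".*" extends each match as far as possible, so one match
    # can swallow everything from the first "<" to the last ">".
    return re.sub(r"<.*>", "", text)


def strip_tags_reluctant(text: str) -> str:
    # Reluctant: ".*?" stops at the first ">" that completes a match,
    # so each tag is removed separately.
    return re.sub(r"<.*?>", "", text)


def strip_script_once(text: str) -> str:
    # A single left-to-right replacement pass. The output is not
    # re-scanned, so removing one occurrence can assemble a new one.
    return re.sub(r"<script>", "", text)


if __name__ == "__main__":
    sample = "<b>hi</b>"
    print(repr(strip_tags_greedy(sample)))      # ''
    print(repr(strip_tags_reluctant(sample)))   # 'hi'

    # Hypothetical attacker input that survives the one-pass sanitizer.
    print(strip_script_once("<scr<script>ipt>"))  # <script>
```

The last call shows why a solver needs the exact replacement semantics: the question "which inputs does the sanitizer map to `<script>`?" is a backward image query, and its answer depends on whether matches are greedy, reluctant, or re-scanned.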
Similar resources
On Simple Linear String Equations
This paper presents a novel backward constraint solving technique for analyzing text processing programs. String constraints are represented using a variation of word equation called Simple Linear String Equation (SLSE). SLSE supports precise modeling of various regular string substitution semantics in Java regex, which allows it to capture user input validation operations widely used in web ap...
Simple Linear String Constraints 1
Modern web applications often suffer from command injection attacks such as Cross-Site Scripting and SQL Injection. Even equipped with sanitation code, many systems can still be penetrated due to the existence of software bugs (see e.g., the Samy Worm). It is desirable to automatically discover such vulnerabilities, given the bytecode of a web application. One solution would be symbolically exe...
A String Constraint Solver for Detecting Web Application Vulnerability
Given the bytecode of a software system, is it possible to automatically generate attack signatures that reveal its vulnerabilities? A natural solution would be symbolically executing the target system and constructing constraints for matching path conditions and attack patterns. Clearly, the constraint solving technique is the key to the above research. This paper presents Simple Linear String...
An SMT-LIB Format for Sequences and Regular Expressions
Strings are ubiquitous in software. Tools for verification and testing of software rely in various degrees on reasoning about strings. Web applications are particularly important in this context since they tend to be string-heavy and have a large number of security errors attributable to improper string sanitization and manipulation. In recent years, many string solvers have been implemente...
An Evaluation of Automata Algorithms for String Analysis
There has been significant recent interest in automated reasoning techniques, in particular constraint solvers, for string variables. These techniques support a wide variety of clients, ranging from static analysis to automated testing. The majority of string constraint solvers rely on finite automata to support regular expression constraints. For these approaches, performance depends criticall...